Journal of the Association for Research in Otolaryngology
Springer Science and Business Media LLC
Preprints posted in the last 90 days, ranked by how well they match Journal of the Association for Research in Otolaryngology's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Tripathy, S.; Budak, M.; Maddox, R.; Mehta, A. H.; Roberts, M. T.; Corfas, G.; Booth, V.; Zochowski, M.
Hidden hearing loss (HHL) is an auditory neuropathy characterized by altered auditory nerve responses despite normal hearing thresholds. Recent experimental and computational studies suggest that permanent disruptions to heminode positions in spiral ganglion neuron (SGN) fibers can contribute to these deficits. However, the interaction between heminode disruption and noisy backgrounds ubiquitous in daily listening remains unexplored. This study investigates how background noise affects auditory processing with these peripheral disorders and how deficits propagate to downstream sound localization circuits in the superior olivary complex. We developed computational models of SGN fibers with mild and severe degrees of heminode disruption, subjected to sinusoidal tone stimuli in the presence of background noise with varying spectral characteristics. We analyzed the phase-locking of SGN fiber responses to the stimulus tone and modeled the subsequent effects on interaural time difference (ITD) sensitivity in the medial superior olive (MSO) using a binaural localization network. We found that near-tone-frequency noise disrupted SGN phase locking through cycle-to-cycle variability in spike phases, with effects consistent across tone frequencies. Mild heminode disruption produced frequency-dependent degradation in SGN phase locking, with effects observed only at higher frequencies tested (600-1000 Hz), without reducing overall firing rates. Critically, the effects of noise and heminode disruption were additive, with combined exposure leading to reduced ITD sensitivity and large temporal fluctuations in MSO responses. Severe heminode disruption, which additionally reduced firing rates at the SGN fibers and subsequent stages, produced profound localization deficits across all frequencies tested. 
Thus, our model results suggest that noisy environments exacerbate auditory deficits from peripheral disorders implicated in HHL and could potentially impair speech intelligibility through degradation in localization ability. This model may be useful for understanding the downstream impacts of SGN neuropathies.
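The phase-locking analysis this abstract describes is conventionally quantified with the vector-strength metric (Goldberg and Brown). As an illustrative sketch on synthetic spike trains, not the authors' model code, vector strength can be computed from spike times as follows:

```python
import numpy as np

def vector_strength(spike_times, freq):
    """Vector strength of spike phase locking to a tone of frequency `freq` (Hz).

    1.0 = all spikes at the same stimulus phase; 0.0 = phases uniformly spread.
    """
    phases = 2.0 * np.pi * freq * np.asarray(spike_times)  # phase of each spike (rad)
    return np.abs(np.mean(np.exp(1j * phases)))

# Perfectly locked spikes: one spike per cycle of a 600 Hz tone over 0.5 s
locked = np.arange(0, 0.5, 1 / 600.0)
# Unlocked spikes: uniformly random times over the same window
rng = np.random.default_rng(0)
unlocked = rng.uniform(0, 0.5, size=300)

print(round(vector_strength(locked, 600.0), 3))    # ~1.0
print(round(vector_strength(unlocked, 600.0), 3))  # near 0
```

Cycle-to-cycle variability in spike phase, as induced by near-tone-frequency noise in the study, lowers this statistic even when firing rates are unchanged.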
Kamau, A. F.; Merchant, G. R.; Nakajima, H. H.; Neely, S. T.
Conductive hearing loss (CHL) with a normal otoscopic exam can be difficult to diagnose because routine clinical measures such as audiometric air-bone gaps (ABGs) can identify a conductive component but often cannot distinguish among specific underlying mechanical pathologies (e.g., stapes fixation versus superior canal dehiscence, which may produce similar audiograms). Wideband tympanometry (WBT) is a fast, noninvasive test that can provide additional mechanical information across a broad range of frequencies (200 Hz to 8 kHz). However, WBT metrics are influenced by variations in ear canal geometry and probe placement and can be challenging to interpret clinically. In this study, we extend prior WBT absorbance-based classification work by estimating the middle ear input impedance at the tympanic membrane (ZME), a WBT-derived metric intended to reduce ear canal effects. To estimate ZME, we fit an analog circuit model of the ear canal, middle ear, and inner ear to raw WBT data collected at tympanometric peak pressure (TPP). Data from 27 normal ears, 32 ears with superior canal dehiscence, and 38 ears with stapes fixation were analyzed. A multinomial logistic regression classifier was trained using principal component analysis (retaining 90% variance) and stratified 5-fold cross-validation with regularization. We compared feature sets based on ABGs alone, ABGs combined with absorbance, and ABGs combined with the magnitude of ZME. The combination of ABGs and the magnitude of ZME produced the best performance, achieving an overall accuracy of 85.6% compared to 80.4% for ABGs alone and 78.4% for ABGs combined with absorbance. These results suggest that incorporating model-derived middle ear impedance features with standard audiometric measures (ABGs) can improve automated pathology classification for stapes fixation and superior canal dehiscence.
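The classification pipeline described above (PCA retaining 90% of variance, regularized multinomial logistic regression, stratified 5-fold cross-validation) maps onto a standard scikit-learn pipeline. A minimal sketch on synthetic stand-in features; the feature counts, class means, and separability here are invented for illustration and are not the study's data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_per_class, n_feat = 30, 64
# Synthetic stand-in for per-ear feature vectors (e.g., ABGs + |ZME| across
# frequency): three classes with shifted means.
X = np.vstack([rng.normal(loc=mu, size=(n_per_class, n_feat))
               for mu in (0.0, 0.5, 1.0)])
y = np.repeat([0, 1, 2], n_per_class)  # normal, SCD, stapes fixation

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.90),        # keep components explaining 90% of variance
    LogisticRegression(C=1.0, max_iter=1000),  # L2-regularized, multinomial
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"mean CV accuracy: {acc.mean():.2f}")
```

Passing a float below 1 to `PCA(n_components=...)` selects however many components are needed to reach that explained-variance fraction, mirroring the 90% criterion in the abstract.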
Wouters, M.; Gaudrain, E.; Dapper, K.; Schirmer, J.; Baskent, D.; Ruettiger, L.; Knipper, M.; Verhulst, S.
Speech perception difficulties in noise are common among older adults and individuals with hearing impairment, even when audiometric thresholds appear normal. We examined how aging, cochlear synaptopathy (CS), and outer hair cell (OHC) damage affect speech encoding and phoneme discrimination. Envelope-following responses (EFRs) to rectangular amplitude-modulated (RAM) tones and speech-like phoneme pairs were recorded in quiet using EEG, and behavioral discrimination was assessed in quiet, ipsilateral, and contralateral noise. Stimuli were designed to target temporal envelope (TENV) or temporal fine structure (TFS) encoding. Results showed that RAM-EFR amplitudes decreased gradually with age, consistent with emerging CS, while magnitudes of high-frequency TENV-based EFRs in quiet were most reduced in older hearing-impaired listeners with combined CS and OHC damage. In contrast, EFRs targeting low-frequency TENV encoding in quiet remained preserved. Behaviorally, phoneme discrimination of TFS contrasts worsened with OHC loss and age in quiet and contralateral noise, respectively, while there was no significant effect of age on the discrimination of TENV contrasts. Considering that high-frequency contrasts are discriminated via place-based spectral cues, low-frequency contrasts rely on TFS, and the EFR reflects primarily TENV, this framework explains why EFRs decline for high-frequency cues without perceptual loss, while EFRs remain stable for low-frequency cues even as TFS-based discrimination deteriorates. These findings highlight the need for further investigation into how neural coding deficits relate to perceptual outcomes. Combining electrophysiological and behavioral measures might provide a sensitive framework for detecting subclinical auditory deficits and diagnosing age-related and hidden hearing loss earlier.
Highlights
- Speech-evoked EEG shows OHC loss-related decline of high-CF envelope encoding.
- Speech-evoked EEG shows low-CF envelope encoding stays intact with age.
- Fine-structure contrast discrimination worsens with OHC loss in quiet.
- Fine-structure contrast discrimination worsens with age in contralateral noise.
- Discrimination of high-frequency place-based spectral cues remains robust with age.
- Peripheral coding strength is not directly reflected at the behavioral level.
Rias, E.; Ouwerkerk, I.; Spitzmaul, G.; Dionisio, L.
The medial olivocochlear (MOC) efferent system modulates outer hair cell (OHC) excitability and protects the cochlea from overstimulation. Cholinergic activation of α9α10 nicotinic acetylcholine receptors (nAChRs) triggers Ca²⁺ influx, activating BK and SK2 Ca²⁺-dependent K⁺ channels, with K⁺ extrusion through KCNQ4 restoring the membrane potential. Loss of KCNQ4 causes chronic depolarization, OHC dysfunction, and hearing loss. Here, we investigated how KCNQ4 deficiency affects cochlear efferent synapse development and organization. Using confocal immunofluorescence, we analyzed efferent innervation in the organ of Corti of Kcnq4-/- (KO) and Kcnq4+/+ (WT) mice at 2, 3, 4, and 10 postnatal weeks (W). At 2 W, efferent terminals were similarly distributed between basal and lateral OHC membrane domains in both genotypes. During maturation, WT mice exhibited complete relocation of MOC terminals to the basal domain, whereas KO mice showed delayed maturation, with some terminals still laterally displaced at 10 W. Absence of KCNQ4 was associated with a reduced number and volume of efferent boutons on OHCs. Milder morphometric alterations were observed in efferent boutons within the inner hair cell region. At the molecular level, qPCR revealed downregulation of α10 nAChR subunit, BK, and SK2 transcripts in KO at 4 W, with recovery by 10 W. Despite this recovery, BK protein showed reduced expression, mislocalization, and disorganized synaptic plaques in OHCs. KO mice also displayed age-dependent upregulation of the calcium-binding proteins calbindin and calretinin, suggesting compensatory responses to altered Ca²⁺ homeostasis. Together, these findings demonstrate that KCNQ4 is essential for OHC repolarization and for the maturation and maintenance of cochlear efferent synapses.
Graphical Abstract (Figure 1)
Augsten, M.-L.; Lindenbeck, M. J.; Laback, B.
Cochlear implant (CI) users typically experience difficulties perceiving musical harmony due to a restricted spectro-temporal resolution at the electrode-nerve interface, resulting in limited pitch perception. We investigated how stimulus parameters affect discrimination of complex-tone triads (three-voice chords), aiming to identify conditions that maximize perceptual sensitivity. Six post-lingually deafened CI listeners completed a same/different task with harmonic complex tones, while spectral complexity, voice(s) containing a pitch change, and temporal synchrony (simultaneous vs. sequential triad presentation) were manipulated. CI listeners discriminated harmonically relevant one-semitone pitch changes within triads when spectral complexity was reduced to three or five components per voice, with significantly better performance for three-component compared to nine-component tones. Sensitivity was observed for pitch changes in the high voice or in both high and low voices, but not for changes in only the low voice. Single-voice sensitivity predicted simultaneous-triad sensitivity when controlling for spectral complexity and voice with pitch change. Contrary to expectations, sequential triad presentation did not improve discrimination. An analysis of processor pulse patterns suggests that difference-frequency cues encoded in the temporal envelope rather than place-of-excitation cues underlie perceptual triad sensitivity. These findings support reducing spectral complexity to enhance chord discrimination for CI users based on temporal cues.
Neely, S. T.; Harris, S. E.; Hajicek, J. J.; Petersen, E. A.; Shen, Y.
In a loudness-matching paradigm, a reduction in the loudness of sounds with bandwidths less than one-half octave compared to a tone of equal sound pressure level has been observed previously for five-tone complexes at 60 dB SPL centered at 1 kHz. Here, this loudness-reduction phenomenon is explored using band-limited noise across wide ranges of frequency and level. Additionally, these measurements are simulated by a model of loudness judgement based on neural ensemble averaging (NEA), which serves as a proxy for central auditory signal processing. Multi-frequency equal-loudness contours (ELC) were measured for each of the adult participants (N=100) with pure-tone average (PTA) thresholds that ranged from normal to moderate hearing loss using a categorical-loudness-scaling (CLS) paradigm. Presentation level and center frequency of the test stimuli were determined on each trial according to a Bayesian adaptive algorithm, which enabled multi-frequency ELC estimation within about five minutes of testing. Three separate test conditions differed by stimulus type: (1) pure-tone, (2) quarter-octave noise and (3) octave noise. For comparison, loudness judgements for all three stimulus types were also simulated by the NEA model, which comprised a nonlinear, active, time-domain cochlear model with an appended stage of neural spike generation. Mid-bandwidth loudness reduction was observed to be greatest at moderate stimulus levels and frequencies near 1 kHz. This feature was approximated by the NEA model, which suggests involvement of an early stage of the central auditory system in the formation of loudness judgements.
Sotero Silva, N.; Bröhl, F.; Kayser, C.
Eye-movement-related eardrum oscillations (EMREOs) are pressure changes recorded in the ear that supposedly reflect displacements of the tympanic membrane induced by saccadic eye movements. Previous studies hypothesized that the underlying mechanisms might play a role in combining visual and acoustic spatial information. Yet, whether and how the eardrum moves during an EMREO, and whether this movement affects acoustic spatial perception, remains unclear. Here we probed human acoustic lateralization performance for sounds presented at different times during a saccade (hence the EMREO) in two tasks, one relying on free-field sounds and one presenting sounds in-ear. Since EMREO generation likely involves the middle ear muscles, whose tension can alter sound transmission, it is possible that judgements of sound location vary with the state of the EMREO at the time of sound presentation. However, when testing two specific hypotheses of how movements of the eardrum underlying the EMREO may affect spatial hearing, we found no evidence in support of this. Still, and in line with previous studies, we found that participants' lateralization responses were shaped by the spatial congruency of the saccade target direction and the sound direction. Thus, either the eardrum does not move directly as reflected by the EMREO signal, or, despite its movement, the underlying changes at the tympanic membrane have only minimal perceptual impact. Our results call for more refined studies to understand how the eardrum moves during a saccade and whether or how the EMREO impacts spatial perception.
Caro, A. M.; Zhang, Z.; Gansemer, B. M.; Green, S. H.
Spiral ganglion neurons (SGNs) constitute the sole afferent connection between cochlear hair cells and central auditory nuclei. SGNs die during postnatal developmental pruning, and also following hair cell death, which can be triggered by ototoxic agents such as aminoglycoside antibiotics, including kanamycin. After hair cell loss, animal models show extensive SGN degeneration occurring gradually over a period of weeks to months. Here, we compared spatial and temporal patterns of SGN loss and immune cell involvement in these two cases of cell death in rats. Developmental SGN pruning occurred from postnatal day 5 (P5) to P8 in the basal half of the cochlea, and from P5 to P12 in the apical half. This was accompanied by a transient increase in spiral ganglion macrophages that was temporally and spatially correlated with SGN death, consistent with a role in clearing degenerating neurons. After deafening neonatal rats with kanamycin injections, SGN death became evident at approximately 5.5 weeks of age and persisted throughout the ganglion, with the greatest loss in the middle regions and less in the base and apex. Macrophage numbers also increased, but neither temporally nor spatially correlated with SGN death. Rather, increased macrophage number and activation began approximately three weeks before SGN death and were highest in the apex. Additionally, T-cells and NK cells appeared in the ganglion concurrently with SGN degeneration. These observations suggest fundamentally different roles for macrophages post-deafening than during developmental pruning and, together with prior observations that anti-inflammatory drugs reduce SGN death, support a causal role for immune responses in SGN death post-deafening.
King, C. D.; Zhu, T.; Groh, J. M.
Information about eye movements is necessary for linking auditory and visual information across space. Recent work has suggested that such signals are incorporated into processing at the level of the ear itself (Gruters, Murphy et al. 2018). Here we report confirmation that the eye movement signals that reach the ear can produce perceptual consequences, via a case report of an unusual participant with tensor tympani myoclonus who hears sounds when she moves her eyes. The sounds she hears could be recorded with a microphone in the ear in which she hears them (left), and occurred for large leftward eye movements to extreme orbital positions of the eyes. The sounds elicited by this participant's eye movements were reminiscent of eye movement-related eardrum oscillations (EMREOs; Gruters, Murphy et al. 2018, Brohl and Kayser 2023, King, Lovich et al. 2023, Lovich, King et al. 2023, Lovich, King et al. 2023, Abbasi, King et al. 2025, Sotero Silva, Kayser et al. 2025, King and Groh 2026, Leon, Ramos et al. 2026, Sotero Silva, Brohl et al. 2026), but were larger and longer lasting than classical EMREOs, helping to explain why they were audible to her. Overall, the observations from this patient help establish that (a) eye movement-related signals specifically reach the tensor tympani muscle and (b) when there is an abnormality involving that muscle, such signals can lead to actual audible percepts. Given that the tensor tympani contributes to the regulation of sound transmission in the middle ear, these findings support the conclusion that eye movement signals reaching the ear have functional consequences for auditory perception. The findings also expand the types of medical conditions that produce gaze-evoked tinnitus, to date most commonly observed in connection with acoustic neuromas.
Borrajo, M.; Callejo, A.; Castellanos, E.; Amilibia, E.; Llorens, J.
Vestibular schwannomas (VS) cause vestibular function loss by mechanisms that are still poorly understood. We evaluated the vestibulo-ocular reflex by the video-assisted Head Impulse Test (vHIT) in patients with planned tumour resection by a trans-labyrinthine approach. The vestibular sensory epithelia were collected and processed by immunofluorescent labelling for confocal microscopy analysis of sensory hair cell subtypes (type I, HCI, and type II, HCII), calyx endings of the pure-calyx afferents, and the calyceal junction normally found between HCI and the calyx (n=23). Comparing Normofunction and Hypofunction patients, we concluded that worse vestibular function is associated with decreased HCI and HCII counts in the sensory epithelia and with an increased proportion of damaged calyces. A decrease in the numbers of HCI and of calyx endings of the pure-calyx afferents was also found to be associated with increasing age. Partial least squares regression (PLSR) models indicated that VS and age had independent, additive effects on vestibular function. Correlation analyses indicated that lower vHIT gains are associated with lower numbers of HCI and increased percentages of damaged calyces. These data support the hypothesis that the deleterious effect of VS on vestibular function is mediated, at least in part, by its damaging impact on the vestibular sensory epithelium. They also provide further evidence for the dependency of the vestibulo-ocular reflex on HCI function and for calyceal junction pathology as a common response of the sensory epithelium to HC stress.
Palou, A.; Tagliabue, M.; Beraneck, M.; Llorens, J.
The rat vestibular system plays a critical role in anti-gravity responses such as the tail-lift reflex and the air-righting reflex. In a previous study in male rats, we obtained evidence that these two reflexes depend on the function of non-identical populations of vestibular sensory hair cells (HC). Here, we caused graded lesions in the vestibular system of female rats by exposing the animals to several different doses of an ototoxic chemical, 3,3'-iminodipropionitrile (IDPN). After exposure, we assessed the anti-gravity responses of the rats and then assessed the loss of type I HC (HCI) and type II HC (HCII) in the central and peripheral regions of the crista, utricle and saccule. As expected, we recorded a dose-dependent loss of vestibular function and loss of HCs. The relationship between hair cell loss and functional loss was examined using non-linear models fitted by orthogonal distance regression. The results indicated that both the tail-lift and air-righting reflexes depend mostly on HCI function. However, the two reflexes differed in the epithelia they depend on: while the tail-lift response is sensitive to loss of crista and/or utricle HCIs, the air-righting response depends instead on utricular and/or saccular integrity.
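Orthogonal distance regression, as used above, allows measurement error in both the predictor (hair cell counts) and the response (reflex scores), unlike ordinary least squares. A hedged sketch with scipy.odr on a synthetic sigmoidal dose-response; the logistic form and all parameter values are assumptions for illustration, not the paper's fitted model:

```python
import numpy as np
from scipy.odr import ODR, Model, RealData

def logistic(beta, x):
    """Sigmoid linking % hair cell loss (x) to a behavioral reflex score."""
    top, x50, slope = beta
    return top / (1.0 + np.exp((x - x50) / slope))

rng = np.random.default_rng(3)
hc_loss = rng.uniform(0, 100, 40)                              # % HCI loss
score = logistic([4.0, 50.0, 10.0], hc_loss) + rng.normal(0, 0.2, 40)

# ODR minimizes orthogonal distances, with assumed error scales in x and y
data = RealData(hc_loss, score, sx=5.0, sy=0.2)
odr = ODR(data, Model(logistic), beta0=[4.0, 50.0, 10.0])
fit = odr.run()
print("top, x50, slope =", np.round(fit.beta, 1))
```

The `sx` and `sy` arguments weight how much deviation is attributed to error in the histological counts versus the behavioral scores, which is the point of choosing ODR over a standard curve fit here.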
Xue, W.; Sun, N.; Wood, E.; Xie, J.; Liu, X.; Yan, J.
Prolonged exposure to loud and moderate noise impairs hearing; the lower the noise level, the lower the risk of hearing loss. To date, little is known about how low a noise level must be to be safe for hearing. This study investigated the risk of exposure to tones at typical conversational levels by measuring the auditory brainstem response (ABR). We show that exposing C57 mice to a continuous pure tone at 65 dB SPL for 1 hour (TE65) leads to an increase in ABR threshold that is specific to the exposure frequency. Tone exposure also increased the latencies and decreased the amplitudes of Waves I and II but not of Waves III and V. Significantly, the changes in amplitude and latency were highly correlated in Wave I, and this correlation gradually degraded from Wave I through to Wave V. Our findings suggest that exposure to low-level sound, if persistent and presented over a sufficient period of time, can impair hearing and alter auditory information processing in the brain. Significance Statement: Our findings establish the risk of hearing impairment following exposure to a continuous tone at normal or conversational voice levels. This finding challenges current public health guidelines for hearing protection. Although further clarification is required, our studies suggest that regular ABR testing is a potential protocol for diagnosing hearing impairment in patients experiencing hidden hearing loss (HHL).
Motlagh Zadeh, L.; Izhiman, D.; Blankenship, C. M.; Moore, D. R.; Martin, D. K.; Garinis, A.; Feeney, P.; Hunter, L. R.
Objectives: Patients with cystic fibrosis (CF) often receive aminoglycosides (AGs) to manage recurrent pulmonary infections, placing them at risk for ototoxicity. Chronic AG use can lead to complex cochlear damage affecting inner and outer hair cells, the stria vascularis, and spiral ganglion neurons. The greatest damage is typically in the basal cochlear region, which encodes high-frequency hearing, with additional involvement of more apical regions. While extended-high-frequency (EHF) hearing loss (EHFHL; 9-16 kHz) is often the earliest sign of AG ototoxicity, speech in noise (SiN) effects are rarely studied. Our overall hypothesis is that SiN perception difficulties in individuals with CF, treated with AGs, are related to combined cochlear and neural damage, primarily in the EHF range but also in the standard frequency (SF; 0.25-8 kHz) range. Three mechanisms that contribute to SiN perception were evaluated in children and young adults: 1) a primary effect of reduced EHF sensitivity, measured by pure-tone audiometry (PTA) and transient-evoked otoacoustic emissions (TEOAEs); 2) a secondary effect of subclinical damage in the SF range, measured by PTA and TEOAEs; and 3) additional neural effects, measured by middle ear muscle reflex (MEMR) threshold (afferent) and growth functions (efferent). Design: A total of 185 participants were enrolled; 101 individuals with CF treated with intravenous AGs and 84 age- and sex-matched controls without hearing concerns or CF. Assessments included EHF and SF PTA; the Bamford-Kowal-Bench (BKB)-SIN test for SiN perception; double-evoked TEOAEs with chirp stimuli from 0.71 to 14.7 kHz; and ipsilateral and contralateral wideband MEMR thresholds and growth functions using broadband stimuli. Results: Reduced sensitivity at EHFs (PTA, TEOAEs) was not associated with impaired SiN perception in the CF group. SF hearing, regardless of EHF status, was the primary predictor of SiN performance in the CF group.
Increased MEMR growth was also significantly associated with poorer SiN in the CF group. Conclusions: In CF, impaired SiN perception was primarily predicted by SF hearing impairment, with additional involvement of the efferent auditory pathway through increased MEMR growth. These results build on prior evidence for efferent neural effects due to ototoxic exposures, supporting both sensory (afferent) and neural (efferent) mechanisms that contribute to listening difficulties in CF. Thus, preventive and intervention strategies should consider these combined mechanisms in people with AG ototoxicity to address their SiN problems.
MacLean, J.; Zhou, M.; Bidelman, G.
Entrainment and predictive coding aid speech perception in both quiet and noisy environments. Isochronous, periodic auditory rhythmic cues facilitate entrainment and temporal expectations, which can benefit encoding and perception of target speech. However, most studies using isochronous cues confound periodicity with predictability. To this end, we characterized how systematic changes in the acoustic dimensions of stimulus rate, target phase, periodicity, and predictability of an entraining sound precursor impact the subsequent identification of concurrent speech targets. Target concurrent vowel pairs were preceded by rhythmic woodblock cues which were either periodic-predictable (PP, isochronous rhythm), aperiodic-predictable (AP, accelerating rhythm), or aperiodic-unpredictable (AU, random rhythm). The number of pulses per rhythm was roved to further manipulate predictability. Stimuli also varied in presentation rate (2.5, 4.5, 6.5 Hz) and target speech phase (in-phase, 0°; out-of-phase, 90°, 180°) relative to the preceding entraining rhythm. We also measured participants' musical pulse continuation and standardized speech-in-noise perception abilities. We did not observe any effects of stimulus rhythm, rate, or target phase on target speech identification accuracy. However, reaction times were slowest at the nominal speech rate (4.5 Hz) and were most disrupted by out-of-phase presentations following the PP rhythm. Double-vowel task performance was associated with stronger musical pulse continuation abilities, but not with speech-in-noise perception. Our results support the notion that entraining rhythmic cues rely on top-down processing but are relatively muted when stimulus predictability is unknown. Additionally, we find that individual differences in musical pulse perception may underlie the benefits of rhythmic cueing on subsequent speech perception.
Jedrzejczak, W.; Kochanek, K.; Skarzynski, H.
Introduction: Auditory brainstem response (ABR) is a standard objective method for estimating hearing threshold, especially in patients who cannot reliably participate in behavioral audiometry. However, ABR interpretation is usually performed by an expert. This study evaluated whether two general-purpose artificial intelligence (AI) multimodal large language model (LLM) chatbots, ChatGPT and Qwen, can accurately estimate ABR hearing thresholds from ABR waveform images. Accuracy was measured by comparison with the judgements of three expert audiologists. Methods: A total of 500 images, each containing several ABR waveforms recorded at different stimulus intensities, were analyzed. Three expert audiologists established the reference auditory thresholds based on visual identification of wave V at the lowest stimulus intensity, with the most frequent judgment among the three used as the reference. Each waveform image was independently submitted to ChatGPT (version 5.1) and Qwen (version 3Max) using the same standardized prompt and without additional clinical context. Agreement with the expert thresholds was assessed as mean errors and correlations. Sensitivity and specificity for detecting hearing loss (>20 dB nHL) were also calculated. In cases where the AI and expert thresholds nominally matched, corresponding latency measures were also compared. Results: Auditory thresholds derived from both LLMs correlated strongly with expert opinion, with Pearson r = 0.954 for ChatGPT and r = 0.958 for Qwen. ChatGPT showed a mean error of +5.5 dB and Qwen showed a mean error of -2.7 dB. Exact nominal agreement with expert values was achieved in 34.6% of ChatGPT estimates and 35.6% of Qwen estimates; agreement within ±10 dB was observed in 75.6% and 80.0% of cases, respectively. For hearing-loss classification, ChatGPT achieved 100% sensitivity but low specificity (20.4%), whereas Qwen showed a more balanced profile with 91.6% sensitivity and 67.5% specificity.
Curiously, estimates of wave V latency were markedly poor for both LLMs, with systematic underestimation and weak correlations with the expert judgements. Conclusion: ChatGPT and Qwen demonstrated a moderate ability to estimate ABR thresholds from waveform images, although their performance was not good enough for independent clinical use. Both models captured general patterns of hearing loss severity, but there was systematic bias, limited specificity and sensitivity balance, and poor latency estimation. General-purpose multimodal LLMs may have potential as assistive or preliminary tools, but clinically reliable ABR interpretation will likely require specialized, domain-trained AI systems with expert oversight.
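The agreement metrics reported above (mean error, Pearson correlation, exact and ±10 dB agreement, sensitivity/specificity at a >20 dB nHL cutoff) can all be computed in a few lines. The ten threshold pairs below are invented toy values for illustration, not study data:

```python
import numpy as np

def agreement_stats(ai_db, expert_db, loss_cutoff=20):
    """Compare AI-estimated ABR thresholds (dB nHL) against expert references."""
    ai, ex = np.asarray(ai_db, float), np.asarray(expert_db, float)
    err = ai - ex
    r = np.corrcoef(ai, ex)[0, 1]                      # Pearson correlation
    exact = np.mean(err == 0)                          # exact nominal agreement
    within10 = np.mean(np.abs(err) <= 10)              # agreement within +/-10 dB
    truth, pred = ex > loss_cutoff, ai > loss_cutoff   # hearing-loss classification
    sens = np.mean(pred[truth]) if truth.any() else np.nan
    spec = np.mean(~pred[~truth]) if (~truth).any() else np.nan
    return {"mean_error": err.mean(), "r": r, "exact": exact,
            "within_10dB": within10, "sensitivity": sens, "specificity": spec}

expert = np.array([10, 20, 30, 40, 50, 60, 20, 10, 70, 30])
ai     = np.array([20, 20, 40, 40, 60, 60, 30, 20, 70, 30])
stats = agreement_stats(ai, expert)
print(stats)
```

Note how a model that systematically overestimates thresholds (like the +5.5 dB ChatGPT bias) inflates sensitivity at the expense of specificity, which matches the 100%/20.4% split reported.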
Gastaldon, S.; Gheller, F.; Bonfiglio, N.; Brotto, D.; Bottari, D.; Trevisi, P.; Martini, A.; Vespignani, F.; Peressotti, F.
This study provides the first neurophysiological evidence of how cochlear implant (CI) input affects predictive processing during audiovisual language comprehension in deaf individuals. Using EEG, we compared 18 CI users with 18 normal-hearing (NH) controls during sentence comprehension where final word predictability was determined by high or low semantic constraint (HC vs. LC) of the preceding sentence frame. Between the sentence frame and the final word, an 800 ms silent gap was introduced. Mouth visibility was manipulated during sentence frames (visible or digitally occluded; V+ vs. V-), while the final words were always presented with the mouth visible. In NH participants, lower-beta power (12-15 Hz) in left frontal and central sensors decreased for HC vs. LC contexts during the pre-target silent gap, but only when the mouth was visible, suggesting active prediction generation. In CI users, this lower-beta power decrease was absent. After final word presentation, both groups showed N400 predictability effects, indicating preserved prediction evaluation. However, CI users exhibited extended N400 effects in the V+ condition, suggesting additional processing demands. Across all participants, pre-target beta modulations correlated with language production abilities, supporting prediction-by-production frameworks. Within CI users, poorer audiometric thresholds correlated with larger N400 constraint effects, possibly indicating greater reliance on contextual prediction to compensate for degraded sensory input. These findings demonstrate that CI-mediated perception alters the neural mechanisms of prediction generation. The link between production skills and predictive mechanisms suggests that strengthening expressive language abilities may enhance predictive processing in CI users.
Wong, K. H.; Strimbu, C. E.; Olson, E. S.
Optical coherence tomography (OCT) has allowed in vivo recording of sound-induced vibrations of different regions within the organ of Corti complex (OCC), including the basilar membrane (BM), outer hair cell/Deiters cell (OHC/DC) region, and reticular lamina (RL). In the hook region of the gerbil cochlea, where measurements can be made with a substantially transverse optical axis, the three regions have different and characteristic motion responses: The OHC/DC region has greater motions than the other two regions at frequencies below the best frequency (sub-BF); the RL region typically has the greatest BF peak and smallest sub-BF motion. The phase of the OHC/DC-region motion increasingly lags BM motion phase as frequency increases; the RL-region motion phase leads BM, but with a relatively small value. All three regions are compressively nonlinear in the BF peak, but only the OHC/DC region shows sub-BF compressive nonlinearity. In this paper, we describe the strain that exists within the RL and OHC-body regions. These strains are large where the motion varies over short distances, and a region of large strain can be as short as a single 2.7 μm measurement pixel, or extend over several pixels, with the extensive strains appearing more often at 70 than at 50 dB SPL. Beyond the region of large strain, over a distance that can exceed 20 μm, the OHC/DC region displays nearly unvarying motion spatially -- this region appears to vibrate as a body. Statement of Significance: The sensory tissue of the cochlea responds actively to a sound stimulus: cell-based forces amplify and enhance the vibration of the sensory tissue. Measurements employing optical coherence tomography have identified major vibration patterns along a sensory-tissue-spanning line that includes the active outer hair cells. In this article, we describe the transitional motion between these major vibration regions and the motion strains that exist as vibration morphs from one region to the next.
The findings are presented in frequency response curves to convey the frequency tuning and its stimulus-level dependence, and in one-dimensional heat maps to convey the extent of regional motions and strains. These findings fuel and constrain conceptual and physics-based models of cochlear amplification.
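The strain described in this abstract is, in essence, the spatial derivative of displacement along the measurement line. A minimal sketch of that computation, assuming displacement magnitudes sampled at the 2.7 µm pixel spacing mentioned above (this is illustrative, not the authors' analysis code):

```python
import numpy as np

# Illustrative sketch (not the authors' pipeline): estimate strain along an
# OCT measurement line as the spatial gradient of displacement, sampled at
# the 2.7 micrometer pixel spacing described in the abstract.

PIXEL_UM = 2.7  # axial pixel size in micrometers (from the abstract)

def strain_profile(displacement_nm: np.ndarray, pixel_um: float = PIXEL_UM) -> np.ndarray:
    """Dimensionless strain: change in displacement per unit distance."""
    # np.gradient uses central differences at interior points
    return np.gradient(displacement_nm, pixel_um * 1e3)  # nm / nm -> dimensionless

# A region that "vibrates as a body" (spatially uniform motion) has near-zero
# strain, while a sharp motion transition concentrates strain in a few pixels.
uniform = np.full(16, 50.0)  # nm of displacement, uniform along the line
step = np.concatenate([np.full(8, 50.0), np.full(8, 20.0)])
print(np.abs(strain_profile(uniform)).max())   # essentially zero
print(np.abs(strain_profile(step)).argmax())   # strain peaks at the transition
```

In this picture, the "region of large strain" in the abstract corresponds to pixels where the gradient is large, and the body-like OHC/DC region corresponds to a long run of near-zero gradient.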
Liu, G. S.; Ali, N.-E.-S.; O Maoileidigh, D.
The neural response of the brainstem to brief sounds, known as the auditory brainstem response (ABR), is widely employed in the laboratory and the clinic to diagnose hearing loss. In contrast to behavioral methods that assess hearing using responses to sounds on a trial-by-trial basis, current ABR approaches are limited to analyzing the average ABR over hundreds of trials. Historically, trial-by-trial ABR analysis has not been possible owing to each trial's small signal-to-noise ratio. Here we overcome this limitation and show how to classify individual ABR trials as detected or undetected. We use the distribution of single-trial ABRs to assess supra-threshold hearing and to define psychophysics-like thresholds, which we call auditory brainstem detection (ABD) thresholds. ABD thresholds decrease as more of the ABR epoch is taken into account, whereas traditional ABR thresholds do not change. Above the ABD thresholds and below 90 dB SPL, signal detection is significantly improved by utilizing more of the ABR epoch. Our method also allows us to rank the supra-threshold hearing ability of individual subjects. Despite having normal ABR thresholds, some subjects appear to have supra-threshold hearing deficits. The trial-by-trial method demonstrates that signal detection by the ensemble of auditory neurons in the brainstem is intrinsically stochastic not only at low stimulus levels, but also at levels up to 100 dB SPL. Significance Statement: Neural responses to sound can be measured by electrodes placed on a subject's head and are commonly used in the laboratory and the clinic to assess hearing. Although the auditory system must distinguish each sound stimulus from intrinsic noise, current methods for analyzing the response of the brainstem to sound only utilize the average response to hundreds of stimuli. Here we overcome this constraint by showing how to classify an individual sound stimulus as detected or undetected based on each auditory brainstem response. 
This approach can assess hearing at all stimulus levels, indicates that subjects with normal hearing thresholds can exhibit supra-threshold hearing loss, and potentially extends the types of hearing deficits that can be diagnosed using auditory evoked potentials.
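One simple way to make the detected/undetected idea concrete is template matching: score each single-trial epoch against a template (for instance, the grand-average ABR) and compare the score against the distribution of scores from noise-only epochs. This is a hedged sketch under that assumption, not the authors' actual classifier:

```python
import numpy as np

# Minimal sketch (an assumed matched-filter scheme, not the paper's method):
# classify a single-trial ABR epoch as "detected" if its correlation with a
# template exceeds a threshold set from noise-only (no-stimulus) epochs.

def detection_scores(trials: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Matched-filter score per trial: inner product with a unit-norm template."""
    t = template / np.linalg.norm(template)
    return trials @ t

def classify_trials(trials, template, noise_trials, fpr=0.05):
    """Mark a trial detected if its score exceeds the (1 - fpr) quantile
    of scores computed from noise-only epochs."""
    threshold = np.quantile(detection_scores(noise_trials, template), 1 - fpr)
    return detection_scores(trials, template) > threshold
```

The fraction of trials classified as detected at each stimulus level then plays the role of a psychometric function, from which a psychophysics-like ("ABD"-style) threshold could be read off.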
Manasevich, V.; Kostanian, D.; Rogachev, A.; Sysoeva, O.
Rise time (RT) is considered to be one of the most significant acoustical characteristics of auditory speech stimuli. A substantial amount of data has been accumulated on the neurophysiological mechanisms of RT processing under different conditions and in different groups of people, but these data have not been systematised. This review focuses on studies that have investigated electroencephalographic (EEG) markers of RT sensitivity. The literature search was conducted according to the PRISMA statement in the PubMed, Web of Science and APA PsycInfo databases. The resultant review comprised 37 studies that considered diverse aspects of RT processing. The review describes the main stimulation parameters affecting electrophysiological markers of RT processing, as reflected in different components of event-related potentials (ERPs), brainstem responses and cortical rhythmic activity. The main finding of this review is that rise-time prolongation leads to a decrease in the amplitude of the main ERP components and an increase in their latencies. However, the sensitivity of the EEG markers varied: the earliest components tracked subtle differences (a few tens of microseconds), whereas the later components coded larger ones (up to 500 ms). Nevertheless, the observed effects may vary and depend on aspects of the experimental paradigm, participant age and speech-related problems. Future research may benefit from addressing understudied clinical groups and ERP components such as P1 and N2, which are dominant in children.
Hunter, L. L.; Feeney, M. P.; Fitzpatrick, D.; Keefe, D. H.
Objectives: The overall goal of this study was to assess tympanometric and ambient wideband acoustic immittance (WAI) tests and wideband acoustic reflex thresholds (ART) in well-baby and newborn intensive care (NICU) cohorts, with three specific objectives: 1) Assess predictive accuracy of WBT and ART for conductive dysfunction in ears referring on the first or second stage of newborn hearing screening; 2) Identify inadequate tests likely due to probe blockages or leaks; and 3) Assess prediction models separately for well-baby and NICU screening outcomes. Design: Prospective, observational study of full-term (n=514) and premature (n=239) newborns recruited from the well-baby and NICU nurseries of a birth-hospital newborn hearing screening program. Wideband tympanometry, ambient absorbance, and acoustic reflexes were tested after Stage 1 transient otoacoustic emissions (TEOAE) screening. The reference standard for Pass or Refer groups was initially defined on the Stage 1 TEOAE test result. Pass or Refer groups were then reassigned based on the Stage 2 screening ABR for those who referred at Stage 1, and for all NICU infants. Multivariate models were developed using reflectance and admittance variables to predict conductive dysfunction relative to the screening reference standard in a randomized sub-group of subjects at Stage 1 and Stage 2 screening. Classification accuracy was evaluated on a second, independent sub-group. Individual tests were classified as having inadequate probe fits if they had excessively low values of sound pressure level or susceptance (leak) or absorbance (blockage). Results: Differences in ambient absorbance for Pass v. Refer screening groups revealed the greatest differences and effect sizes in frequency bins between 1.4 and 2 kHz. Screening failure at both Stage 1 and 2 was most accurately predicted by models using ambient absorbance and power level variables at frequencies between 1 and 2.8 kHz, including ARTs. 
Tympanometric admittance variables at the positive-pressure tail for frequencies between 1 and 2.8 kHz, in combination with the ART, were more accurate predictors than those at peak pressure or the negative-pressure tail. Multivariate models generalized well to an independent group of infants at both Stage 1 and 2 for both the ambient and tympanometric models. Ambient tests revealed more inadequate tests than tympanometric tests, primarily due to blocked probe tips. Excluding ears with probe leaks or blockages slightly improved the ambient prediction models, but did not affect tympanometric models. Conclusion: Wideband acoustic reflex tests improved all models for ambient and tympanometric absorbance. Multivariate prediction models developed for WAI tests were repeatable in an independent group of well-baby and NICU infants, suggesting that the results are generalizable to these populations. Detection of probe blockage or leaks slightly improved prediction for ambient measures. Pressurized tests require a hermetic seal, and are thus useful for verifying adequate probe insertion.
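The multivariate models described above combine continuous acoustic variables (e.g., absorbance in several frequency bins) to predict a binary Pass/Refer outcome, with accuracy checked on a held-out sub-group. A minimal sketch of that pattern, with hypothetical variable names and synthetic data (this is not the study's model or dataset):

```python
import numpy as np

# Hedged sketch (hypothetical features, not the study's actual model): a plain
# gradient-descent logistic regression combining continuous predictors, such
# as absorbance in two frequency bins, to predict a binary Refer outcome.

def fit_logistic(X, y, lr=0.1, steps=5000):
    """Logistic regression with an intercept, fit by gradient descent."""
    Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))      # predicted Refer probability
        w -= lr * Xb.T @ (p - y) / len(y)      # mean log-loss gradient step
    return w

def predict_refer(w, X):
    """Classify as Refer when the predicted probability exceeds 0.5."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5
```

As in the study's design, such a model would be fit on one randomized sub-group and its classification accuracy evaluated on a second, independent sub-group.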